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Classical statistical process control often relies on univariate char- 
acteristics. In many contemporary applications, however, the quality 
of products must be characterized by some functional relation be- 
Cn ' tween a response variable and its explanatory variables. Monitoring 

such functional profiles has been a rapidly growing field due to in- 
f^ , . creasing demands. This paper develops a novel nonparametric L-1 

^^ ' location-scale model to screen the shapes of profiles. The model is 

^^ built on three basic elements: location shifts, local shape distortions, 

and overall shape deviations, which are quantified by three individual 
metrics. The proposed approach is applied to the previously analyzed 
'^ ,, vertical density profile data, leading to some interesting insights. 



C^ 



1. Introduction. Since its initial introduction by Shewart in the 1920s, 
statistical process control (SPC) has received increasing attention from both 
\^ ' academia and industry. Traditional SPC often utilizes a single metric (e.g., 

^O , mass or length) to characterize products under inspection. For such metrics, 

lower and upper control limits are then estimated from manufacturing data. 



^ 



C^ ■ For example, such limits could be defined by the mean plus and minus 



o 

(N 



three standard deviations. If its metric falls out of the control limits, it 
could imply a potential change in the underlying distribution (for stability) , 
or a product could be rejected due to a potential quality deficiency (for 
conventional quality control). 

In traditional SPC, the quality of a product is often assumed to be ad- 



j^ , equately characterized by univariate characteristics or certain metrics. In 
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recent years, however, there have been increasing needs for profile control 
charts for which the responses are no longer single measurements but, rather, 
functions of one or several covariates X. Traditional methods handling uni- 
variate or multivariate charts are no longer applicable to such functional 
profile responses [Woodall (2007)]. Several new approaches have been sub- 
sequently proposed. One basic class of methods assumes a linear association 
between a response Y and its covariate X and then constructs the corre- 
sponding control charts on the intercept, slope and variance. When linear- 
ity is not an option, nonlinear models are usually introduced. The control 
charts are then built upon one or several model coefficients. Jensen, Birch 
and Woodall (2008), Jensen and Birch (2009) further consider parametric 
(linear and nonlinear) mixed effect models to take into account intrapro- 
file correlations. Qiu, Zou and Wang (2010) further consider nonparametric 
mixed effects modeling Phase II profile monitoring. 

In many applications, it is difficult to know a priori the shape of a re- 
sponse profile. Inappropriate shape assumptions can lead to substantial es- 
timation bias. To overcome this difficulty, Reis and Saraiva (2006), Jeong, 
Lu and Wang (2006), Ding, Zeng and Zhou (2006), Zhou, Sun and Shi (2007) 
and Chicken, Pignatiello and Simpson (2009) explore nonparametric wavelet 
models and construct control charts based on a portion of the wavelet co- 
efficients. Since the screening method therein relies only on major wavelet 
coefficients, any deviations of the other coefficients may be undetectable. 
Zou, Tsung and Wang (2008) also explore an alternative nonparametric ap- 
proach for profile monitoring in which the measures within each profile are 
assumed to be independent. 

The above approaches always project the profile information onto a set 
of parameters, while a more natural method is to utilize the entire infor- 
mation of the target profiles. We hence propose estimating a reference pro- 
file based on Phase I data and then monitoring the deviations of individ- 
ual profiles in Phase II from the reference one. The basic statistical tool 
we use to model Phase I data is L-1 regression assuming a nonparametric 
location-scale model with a general class of error structure. Compared with 
traditional L-2 regression, L-1 regression is more robust against outliers 
and heavy-tailed distributions. We refer to Koenker (2005) for an exten- 
sive exposition on the properties of L-1 regression. Based on the estimated 
model in Phase I, we propose three deviation measures to monitor individ- 
ual Phase II profiles in overall location shift, local failure and overall shape 
deviation. The control limits of the three deviation metrics are determined 
based on the empirical estimation of their asymptotic distributions. 

As an illustrative example, we apply the proposed method to the vertical 
density profile (VDP) data of Walker and Wright (2002). In that study, the 
manufacturers of engineered wood boards are very concerned about fiber- 
board density, which determines the fiberboard's strength and physical prop- 
erties. The density (Y) is read by a profilometer, a laser device measuring 
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Fig. 1. Walker-Wright data. 

densities at equispaced points (X) along a designated vertical line. Each 
resulting profile consists of 314 density measurements, and the distance be- 
tween two consecutive measures is 0.002 inch. Figure 1 provides a visual 
illustration of the data set. These curves are clearly nonlinear, and thus new 
technologies to monitor these curves are desirable. 

The rest of our paper is organized as follows. Section 2 introduces a non- 
parametric location-scale L-1 regression with a generalized error structure, 
as well as the estimation procedure. We then describe in detail how to 
construct profile control charts utilizing nonparametric L-1 regression. The 
screening is based on three deviation measures, whose asymptotic proper- 
ties are established. The proposed method is then applied to the VDP data 
and illustrated in Section 3.1. Section 3.2 compares the proposed method to 
alternative approaches, and Section 3.3 provides a numerical investigation. 
Section 4 presents concluding remarks and a discussion. The theoretical reg- 
ularity conditions for the proposed method are summarized in the Appendix, 
while the proof of the main theorem is presented in the supplementary ma- 
terials. 

2. Proposed methodology. Two phases, often denoted by Phases I and II, 
are typically involved in SPC [Woodall (2007)]. The goal in Phase I is to 
construct the control limits which determine if a process has been in control 
over the period of time. Phase II then applies these limits to detect a poten- 
tial change in the underlying distribution (for stability). This paper follows 
the same convention. 

2.1. Modeling Phase I profiles. 



2.1.1. Representation of Phase I profiles based on a family of nonparamet- 
ric location-scale models. Suppose there exist n independent Phase I pro- 
files, {Yj = (li,i, li^2) • • • ) yi,mi),'i' = 1) • • • ) n}, where rui denotes the number 



4 Y. WEI, Z. ZHAO AND D. K. J. LIN 

of elements. Let Xij represent the location where Yij is taken. For instance, 
if Yj is a time sequence, then Xij can be the underlying measurement time. 
Given this notation, we assume that Phase I profiles follow a nonparametric 
location-scale model 

(1) Yij = 6i + fi{xij) + s{xij)eij, l<j<ini,l<i<n. 

Here we define 6i = median(Yj) as the marginal median of the ith profile, 
and view it as the profile center. We also assume that, for each profile i, 
the error process {ejjjjgN in (1) is an independent copy from a stationary 
process with a sufficiently general dependence structure (as specified in Con- 
dition 1 in the Appendix). Such a proposed error structure is well suited for 
control chart profiles, mainly because of two factors: (1) it allows for high 
correlations induced by dense measurements, differentiating it from typical 
longitudinal data settings, and (2) unlike classical time sequences, it allows 
the dependence of both left and right neighboring measurements. The details 
of the error structure are discussed in the Appendix. We further assume that 
median(eij) = and median(|ejj|) = 1. Under these assumptions, n{x) is the 
conditional median of a centered profile [Y(a;) — median{Y(x)}] given the 
location x, where Y(x) is a random profile satisfying model (1). It represents 
the standard shape of a normative centered response profile, and we call it 
the reference profile. The function s{x) is the conditional median absolute 
deviation (MAD) of the centered profile given the location x. It measures the 
extent to which a normative profile can deviate from the reference profile at 
a given location x, and we call this the reference deviation function. Hence, 
we decompose the profiles into three domains: center, shape, and variability. 

2.1.2. Stepwise estimation. The key components of model (1) are the 
profile centers 6i, a reference profile /x(x) and a reference deviation func- 
tion s{x). These can be estimated sequentially as follows. 

Step 1: Estimation of 6i. In a simpler case, where all the profiles are mea- 
sured on a set of fixed evenly-spaced locations with mi = m, the profile- 
specific centers, 6i, can be estimated by taking the sample median over the 
observed {Yij,j = 1, . . . ,m} for each profile. This type of profiling typically 
occurs in manufacturing studies such as the VDP profiles introduced earlier. 
In more general applications, the number of measurements each sequence 
contains, rrij, can vary across profiles, the locations Xij can be unevenly 
spaced, and their spacing can also vary across profiles. To handle such vary- 
ing location profiles, we can assume that each measurement time/location 
is a random draw from an underlying distribution F{x), and the observed 
locations Xij are the order statistics of rrii random draws. Letting X be 
the random variable following the distribution F{x), we then define the 
profile center as 5i = avgmmQ EY^(x)\^i{^) ~ ^1) where Yi{x) is the under- 
lying profile such that Yij = Yi{xij). Note that since EY^(x)\Yi{X) — 9\ = 
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Ex[EYi(x){\^ii^) ~ ^ll-'^ = 2;}], the individual center 6i can be estimated by 
minimizing the sample objective function ^ • \Yij — 6\f{xij), where /(xjj) 
is the density of X at location Xij and can be estimated from the pooled 
sample Xij (over both i and j). When Xij are evenly spaced, f{x) is a uni- 
form density, that is, f{xij) = f{xij/) for any j ^ j' . Hence, the center can 
be simply estimated as the sample median. 

Step 2: Estimation of n{x). A kernel-based estimation procedure is em- 
ployed and the algorithm details are as follows. Under the specified error 
structure and other mild conditions (as listed in the Appendix), the result- 
ing estimated functions are uniformly consistent and asymptotically normal. 
Let Ki,^{u) = K{u/bn) be a nonnegative kernel function with bandwidth 
bn> that satisfies J^K{u) du= 1. We propose the following least absolute 
deviation (LAD) estimation for the median function fJ,{x): 

n nii 

(2) iJ-bSx) = argmin^^ iFij -6i- e\Kb^{xij - x). 

^ t=i j=i 

The estimation equation (2) is a locally constant type. One can also extend 
it to a locally linear estimate, as in Fan and Gijbels (1996), without much 
technical difficulty. We settle on the locally constant approach mainly for 
computational simplicity. 

Following Theorem 1 of Wei, Zhao and Lin (2011), {fib„{x) — /^(x)} con- 
tains a bias term of order 0(6^). To remove the bias, we adopt a corrective 
jackknife estimator [Wu and Zhao (2007)]: 

(3) fj'b„ (x) = 2fih„ (x) - fi^^^ (x) . 

The bias-corrected estimator jlh^ is uniformly consistent and normally dis- 
tributed for any x asymptotically [Wei, Zhao and Lin (2011)]. 

Step 3: Estimation of s{x) . The reference deviation function s{x) is then es- 
timated from the residuals in the preceding step. Notice that median(|ejj|) = 
1 entails median(|yjj — di — fj,{x)\\xij = x) = s{x). Therefore, we propose the 
following median quantile estimate of s{x): 



(4) Sfe„(x) = argminVV||yjj-(5i-/ib,^(x)| - 9\Kh„{x 



X) 



where /in > is another bandwidth and jibn{x) is a bias-corrected jack- 
knife estimator. Similarly, we construct a bias-corrected jackknife estimator 
of s{x) with 

(5) ~Shr, (x) = 2sh„ (x) - s^^^ (x) , 

which is uniformly consistent and asymptotically normal, as shown in Wei, 
Zhao and Lin (2011). This stepwise estimation is also used in He (1997) 
under the constraint that both /u(x) and s(x) are linear. 
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Bandwidth selection. To implement the proposed methods, one needs to 
select the proper band widths 6„ and /i„. Popular choices include plug-in 
methods, cross-validation (CV) and generalized CV methods. Originally de- 
veloped for independent data, CV methods often tend to undersmooth cor- 
related data [Opsomer, Wang and Yang (2001)]. However, in the absence of 
universally efficient alternatives, the CV methods are still among the most 
widely used [Fan and Yao (2003), Li and Racine (2007)]. This paper also 
adopts the CV approach. 

The basic idea is to leave one profile out and fit the model using the re- 
maining profiles. We then choose the optimal bandwidth that minimizes the 
prediction error. Classical CV methods deal with quadratic losses, whereas 
here we work with the L-1 loss penalty. Hence, we propose selecting the 
bandwidth 6„ by minimizing the following modified CV criterion, that is, 

n mi 

(6) K = argmin^^ |yjj - 6i - flb„-i{xij)\, 

where fib„,-i is the bias-corrected jackknife estimator of fi based on all but 
the ith profile. Similarly, we choose /i„ with 

n rrii 

(7) /i; = argmin^^||yij-(5i-/2b„| - Sh„-^{x^J)\, 

where s^^^^i is defined in the same way as fJn,n-i- 

Alternative smoothing approaches. Other nonparametric estimation meth- 
ods, such as smoothing splines, wavelets and normalized B-splines, can also 
be used to estimate jiix) and s(x), respectively, in steps 2 and 3. In gen- 
eral, the estimated ^{x) and s{x) using other smoothing techniques are also 
consistent, although their limiting distributions need to be investigated sep- 
arately. The proposed CV criterion can also be adapted to choose other 
smoothing parameters. We refer to Opsomer, Wang and Yang (2001) for 
a detailed comparison of various smoothing methods. 

2.2. Construct profile control charts. Suppose {(x;,y/),/ = 1,2, ... ,?77.*} 
is a new profile from Phase II, where m* is the number of measurements of 
the new profile. We would like to test whether it is different from the Phase I 
profiles. Specifically, we are interested in testing the null hypothesis, 

Hq : The new profile comes from the same profile population, 

against the alternative hypothesis Ha, that the new profile comes from a dif- 
ferent population. 

Three deviation measures and their control limits. A new profile can 
differ from the reference profiles in two ways: through a vertical shift or 
a shape change. We propose three deviation measures: one for the first type 
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and two for the latter. The profile control charts consist of the screening 
thresholds of the three deviation measures. 

To monitor the vertical outliers, we first estimate the center of the new 
profile, and denote it as 6* . We then standardize it by 

D = \6* - fis\/ss, 

where fis and sg are, respectively, the sample median and MAD of the 
Phase I profile centers Jj's. The deviation score D then provides a rela- 
tive ranking of the new profile, from the inside to the outside, with respect 
to the Phase I profiles. To determine the screening threshold for D, we can 
generate its reference distribution by the empirical (or bootstrap) distribu- 
tion of the di = \6i — jj-sl/ss- Consequently, we can use the (1 — a)th upper 
quantile of the dj's as the control limit and denote it c^^'{a). Here a is 
the significance level and will be determined at a later step. If the devia- 
tion score D exceeds c('')(a), then the profile will be singled out due to its 
outlying location relative to the Phase I profiles. 

It is more challenging to screen shape deviations. We first center the 
profile by {Yj* =Yi — 6* ,1 = 1, . . . , m*}. The center of the new profile 1^* is 
then zero. By construction, the reference profile iJ,{x) is also centered at zero, 
which means there are no systemic distances between the two profiles after 
the centering step. Consequently, the main differences between Y^* and /i(x) 
are only due to their different shapes. 

Recall that jlb^ and Sh„ are the bias-corrected estimates of /x(x) and s{x). 
We define 

(8) 6/:=-!-: — , l<l<m , 

which measures the relative deviation of Y^ from the estimated jibni^i) 
given the estimated scale function Sh^{xi). We thus consider the following 
deviation measures: 

m* 

(9) r(^)= max \ei\ and T^^^=y^\ei\. 

KKm* ^-^ 
/=! 

The first statistic T^^' measures the maximal local shape deviation of the 
new profile from the estimated reference profile, while the second score T^' 
measures its cumulative shape deviation. The two scores complement each 
other, since one monitors the overall shape change while the other monitors 
local perturbations. Together, the scores provide a comprehensive monitor- 
ing of shapes. Gardner et al. (1997) also consider estimating a "reference" 
surface, using the sum of the residuals to detect unusual signals. 

To determine the screening thresholds for the two measures, we first need 
to derive the distributions of T^'^> and T^'^' under the null hypothesis. Letting 
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ei = {Yi — 6 — fj,{xi)} / s{xi) be the error of the new profile, the theorem cited 
below establishes the asymptotic distributions of T^^^ and T^^' . Here T^^' 
has an asymptotic extreme value distribution [Galambos (1987)] due to its 
maximum structure, while T'^^ is asymptotically normally distributed. 

Theorem 1. Let Nn be as in Condition 2 (in the Appendix). Define 

- _,4 , ,4 , O^g^V^ (logiV^ 

^" "- + ''"+ (iV.6„)V2 + {N^KV/^- 
Under Conditions 1-4 T^'^ ^^6 Appendix), the following statements hold: 
(i) Under Hq, as n^ oo, m* — )• oo, 

where Z has the extreme value distribution F, and {'^m*^Pm*)m*&i is a non- 
random sequence with -jm* i 0, such that P{maxi<Km. |e/| < 7m* a^ + Pm*} = 
F{x) for all continuity points x of F and '^nPm* hm* — ^ 0. 

(ii) Let eo be an independent and identically distributed (i.i.d.) copy of ei 
and Condition 1 (in the Appendix) holds with q = 2. Assume that the density 
function of eo is bounded, and further assume that H„m*^'^ — ?• as n,m* — )■ 
oo. Then, under Hq, as n,m* — )• oo, 



T(2) 



m fi 



where 



N{0,a^ 



(10) ;[i = E(|eo|) and ct^ = Var(|eo|) + 2^Cov(|eo|, |e,|) < oo. 

1=1 

The conditions for Theorem 1 are summarized in the Appendix, and these 
conditions are rather common in practice. The proofs are presented in the 
supplementary materials [Wei, Zhao and Lin (2011)]. 

The limiting behavior of T^^' is substantially determined by the distri- 
bution and dependence characteristics of the underlying ei. Therefore, it is 
quite challenging to obtain the critical value (i.e., screening threshold) of T*-^) 
by using Theorem 1 directly. For T^"^' , on the other hand, although Theo- 
rem 1 guarantees its normal limiting distribution through (ii), estimating fi 
and 0"^ can be computationally complicated. Hence, we propose estimating 
the control limits numerically based on the Phase I data. Specifically, we 
calculate the shape deviation scores as in (9) for individual Phase I profiles 
that have measurements on {xi,l = 1, . . . , m*} and denote them Tj^i and Ti^2, 
respectively. The screening thresholds are then the (1 — a)th quantiles of Tj^i 
and Ti^2- We denote the two screening thresholds by c^^'{a) and c^'^\a), re- 
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spectively. The control limits can be obtained by bootstrapping the Phase I 
profiles and using the (1 — a)th bootstrap quantiles as the desired screening 
thresholds. 

The derived control limits are valid due to the following reasons: (1) De- 
spite the difficulties of obtaining asymptotic critical values directly, The- 
orem 1 establishes the fact that, under Hq, both statistics T^^' and T^^' 
converge to certain stable limiting distributions as the number of Phase I 
profiles goes to infinity. (2) Assuming that the functions /i(-) and ■s(-) are 
known, then, under Hq, the distribution of e/ of the new profile is the same 
as that of ei^i from the Phase I profiles, where Ci^i = {Yi^i — ii{xi^i)}/s{xi^i). 
Hence, the limiting distributions of T'^' and T^^-* can be well approximated 
by the empirical distribution of max;|ej^/|'s and ^^ Icj^^l's, respectively, with 
sufficiently large n. Due to the uniform convergence of jlb^ and Sh„, the dis- 
tribution of ii^i is uniformly and sufficiently close to that of Cj^/, with large 
enough n and sufficiently small bn and hn- Combining the above facts, one 
can generate the reference distribution of T^^' and T^^' by the empirical (or 
bootstrap) distributions of Tj^i and Ti^2- 

Determining the significance level a. In the proceeding steps, we leave 
the significance level a unspecified and write the three screening thresh- 
olds c^^'{a), c^^>{a) and c(^)(a) as functions of a. We now choose a such 
that 



(11) 



< Q : y ^maxjlj 



a* =max<^ Q:2^max{l|rf^>^(o)(„)},l|7.^^>c(i)(„)},l|7.^2>c(2)(a)}} <'^"0 >, 

where ao is the desired overall significance level. Following the definition 
above, a profile will be considered as an outlier if it appears unusual in 
any of the three domains. And a* is the largest value that ensures that the 
probability of falsely detecting a normative profile is less than ao- 

Summary of the screening procedure. Suppose {xi,Yi) is a new profile. 
Then the screening consists of the following three steps: 

(i) Center the profile by its median 6* and calculate its relative vertical 
deviation by 

D = \6* - jlsl/ss. 

(ii) Calculate the cumulative and maximal shape deviation of the cen- 
tered profile, Yi — 6* , with respect to /i6„(x) and Sh„{x): 



T^ ' = max 

KKm* 



yi - ^* - P'bAxi) 



Sh„ixi) 



T(2) = V" 



1=1 



Yi - 6* - flb„ixi) 



ShAxi) 



10 
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(iii) If either of D, T^^' and T^'^' exceeds its corresponding screening 
threshold, c^^'{a*), c^^'{a*) and c^'^'{a*), respectively, then the profile {xi,Yi) 
will be singled out as a potential outlier. 

3. Application. 

3.1. Profile control chart applied to VDP data. We now return to the 
VDP data to illustrate the proposed profile screening procedure. Recall 
that the density of the wood board (Y) is measured on a dense sequence 
of depths {X) across the board. The VDP data consist of 24 such pro- 
files, and we view them as Phase I profiles. We first calculate the center of 
Yi = {Yi^i, . . . , l^,m.); which is its overall median 6i = median(li). The profile- 
specific centers 5i represent the overall vertical locations of those profiles and 
are presented in Figure 2(a). 

After centering, we estimate the reference profile and reference deviation 
functions /i(x) and s{x) following the proposed estimation procedure. Based 
on the CV criterion in (6) and (7), the optimal bandwidths for 6„ and hn are 
0.015 and 0.01, respectively. The resulting /ib* (i) and Sh^{t) are presented 
in Figure 2(b) and are depicted as solid (reference profile) and dashed (de- 
viation profiles) curves. 

Following the proposed method to construct the control charts, we first 
calculate the relative vertical deviation scores (di) of the 24 profiles (plot- 
ted in the left panel of Figure 3) . We then calculate the defined cumulative 
and maximal deviation scores (Tj^i and Tj^2) for 24 profiles (plotted in the 
middle and the right panels of Figure 3). The control limits are determined 
by assuming a significance level a = 0.03 for each deviation score, which 
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Fig. 2. VDP profiles. Panel (a) presents the centers of the profiles, while panel (b) 
plots the centered profiles together with their estimated reference profile (solid curve) and 
estimated /i(a;) ± s{x) (dotted lines). 
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Fig. 3. Profile charts of relative vertical deviation and post- centering maxima 
mutative deviations. The dotted lines are determined control limits. 



and cu- 



yields an overall significance level oq = 0.12 ~ 3/24. The resulting screening 
thresholds are 2.99, 7.94 and 663.6, respectively. These control limits are 
presented by the gray dotted horizontal lines in Figure 3. As shown in Fig- 
ure 3, Profile A6 is associated with the largest vertical deviation. Profile A3 
is the farthest outlier in the maximal shape deviation, while Profile B6 has 
the largest cumulative shape deviation of all the centered profiles. Although 
they are assumed to be normative profiles, these profiles provide insights 
on what the maximum tolerated vertical and shape deviations are. We plot 
these curves in their original forms in Figure 4. It is clear that Profile A6 has 
a lower vertical density than all the other profiles, Profile B6 is "too flat" in 
the center section compared to the rest of the profiles, while Profile A3 has 
stronger local turbulence than others. Although the VDP data have been 
used in various studies, our conclusion provides some new insights into the 
data set. 

3.2. Comparisons with other approaches. The VDP data have been in- 
vestigated by many others. Most approaches are based on linear profile as- 
sumptions, as in Woodall et al. (2004), Kim, Mahmoud and Woodall (2003), 
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Fig. 4. Three profiles associated with the largest D, T'^' and T^'^' . 
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Mahmoud et al. (2007), Kang and Albin (2000) and Zhu and Lin (2010). 
Although those hnear models may be applicable in some situations, they are 
clearly inappropriate for the VDP data due to their nonlinear nature. Other 
nonlinear profile approaches, on the other hand, tend to be more compli- 
cated, and some are hard to implement in practice. Two of the most rele- 
vant approaches to our work are the bathtub model proposed by Williams, 
Woodall and Birch (2007) and the x^ control charts of Zhang and Albin 
(2009) and Shiau et al. (2009). We elaborate these authors' approaches be- 
low and discuss how they differ from our methods. 

Williams, Woodall and Birch (2007) fit a bathtub function 



f{xij,l3) 



o.i{xij — c)^ + d, Xj > c; 

)^2 + d, Xj < c, 



a2{Xij -c 



to each profile (for Profiles #1,2, . . . ,24), yielding 24 sets of estimates for 
a six-dimensional vector of parameter /3 = (ai , 02 , 6i , 62 > c, d) . The authors 
then construct (i) six univariate control charts for each of the parameters 
(ai, . . . ,d) and (ii) a multivariate T^ control chart for the vector of /3. Based 
on those control charts, they conclude (page 934) that boards #4, 9, 15, 18 
and 24 have outlying profiles. Note that our identified farthest outlying Pro- 
files A3, A6 and B6 correspond to Profiles ^3, ^^10 and ^^15 in their setup. 
Although only #15 is detected by Williams, Woodall and Birch (2007), the 
authors do point out (page 935) that Profiles #6 and #3 should be outliers 
as well, which is consistent with our conclusion. Since Williams, Woodall 
and Birch (2007) restrict the shapes of the profiles to a family of bathtub 
models, the bathtub model may exhibit a certain lack of fit in some profiles, 
which could, in turn, lead to failure in detecting Profiles #6 and #3. More- 
over, the bathtub model suffers from an identifiability issue that affects the 
control charts based on it. That is, two distinct sets of parameters can yield 
nearly identical bathtub curves. 

Zhang and Albin (2009) assume that all the profiles are measured on 
a fixed grid of locations/times. This way, the profiles can be viewed as 
long vectors. Consequently, one can construct a x^ control chart based on 
the individual quadratic distances with respect to an estimated mean vec- 
tor (fis) and an estimated variance-covariance matrix (Ss), that is, Aj = 
{Ui — i^s)'^~^^{yi — (j's)- When the profiles are densely measured, the variance- 
covariance matrix S can be close to singular, which makes the estimation 
challenging. Using this approach, Zhang and Albin (2009) identify Profi- 
les #3, 6, 9, 10 and 15 as outliers. This fully covers our findings of Pro- 
files #3, 10 and 15. Shiau et al. (2009) further apply functional principal 
component analysis to individually smoothed profiles and then monitor po- 
tential outliers based on the quadratic distance of the major principal com- 
ponent scores. Compared to these approaches in Shiau et al. (2009) and 
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Zhang and Albin (2009), the proposed charts have the following two ad- 
vantages: First, the proposed method has the flexibility to accommodate 
random locations, in which case the profiles can be observed for unevenly 
spaced and individual sets of locations. Second, the proposed approach de- 
composes the potential deviations into three domains: location, shape, and 
local disturbance. Consequently, the screening results provide more informa- 
tion with which to detect outliers. In this sense, our proposed approaches can 
be viewed as "targeted" screening, compared to these "generic" screening 
approaches. 

In addition, the proposed model has different setup and noise structure 
from the nonparametric mixed effect model in Qiu, Zou and Wang (2010). 
Specifically, they assumed the model i/ij = g{xij) + fi{xij) + eij, where g is 
the population profile function, fi is the random-effects term due to the 
ith individual profile, and e^j's are i.i.d. random errors with mean and 
variance c^. Essentially, in the proposed model (1), we further write out the 
random function fi{xij) as 6i + s{xij)e{xij), where e(-) is a random process 
with a general correlation structure. This specification does not reduce the 
model flexibility, and makes it easier to handle the heteroscedasticity and 
location shifts. 

3.3. Numerical investigation using synthetic VDP profile-like data. To 
investigate the numerical performance of the proposed method, we generate 
synthetic data sets that mimic the VDP data, based on which we evaluate 
the screening power of the proposed control charts. To simulate VDP-like 
data, we choose the same set of locations (x) as in the VDP data, which 
range from to 0.626, with a grid length of 0.002. We then generate 100 
individual density profiles at the chosen locations based on the following 
model: 

(12) Yi{x) = 6i + TT{x)^ao + ei(x), 

where tt{x) is an eight-dimensional quadratic B-spline basis function with 
internal knots (0.06, 0.16, 0.31, 0.47, 0.56), which are the 0.1th, 0.25th, 
0.5th, 0.75th and 0.9th sample quantiles of the locations x in the VDP data, 
respectively. We further assume that 5i are i.i.d. random coefficients that 
follow a normal distribution A^(0, cr|). Finally, we consider two stochastic 
processes for the error term ei{x): (1) ei{x) ~ N{0,a'^) and (2) ei{x) follows 
a scaled t distribution with three degrees of freedom and variance a^. In 
addition, we assume an exponentially decay correlation structure for both 
error processes, that is, 

(13) corr(ej(3;),ej(x')) =exp{— 8|x — x'|}. 

Note that, in both cases, the median ej(x) = for all x, hence, 6i is the 
median location by construction, that is, the center of the profiles. The 
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Fig. 5. Simulated profiles. The profiles in panel (a) are generated from the Gaussian 
process, while those in panel (b) follow a scaled is process. 



generated profile Yi{x) then follows model (1) with median location (the 
center) Si, reference function fi{x) = tt{x) cxq and reference deviation func- 
tion s(x) = 1. For sensible choices of ctQ, it| and o"^, we estimate them from 
the original VDP data. Specifically, we regress individual profiles over 71(2;) 
using L-1 regression and denote the resulting coefficient estimate Qj. We 
choose otQ by the sample mean of Qj, choose (t| by the sample variance of 
the 6i obtained in the preceding section, and then choose a as the stan- 
dard deviation of the pooled residuals. Figure 5(a) displays the generated 
profiles Yi{x) under the two error distributions. We see that the profiles gen- 
erated from the t process have more turbulence than those generated from 
the Gaussian process. 

Screening. We next construct the control charts using the proposed me- 
thod. When estimating the reference and deviation functions, we choose the 
optimal CV bandwidths bn and /i„ as 0.004 and 0.007, respectively, for the 
Gaussian e{x), and 0.01 and 0.007, respectively, for the ^3 distributed e(t). 
The three control limits are obtained assuming the overall significance level 
of 0.05 following equation (11). To investigate the screening power of the 
proposed control charts for the two sets of profiles above, we generate an- 
other 100 profiles from the true model (12) and from each of the following 
two "wrong" models: 

Model (a) Yi{x) = Oj + 7r(x)~''Q! + Asin(107rx) + ei{x), 

Model (b) Yi{x) = a^ + irix^cx + Bcpix - 0.3)/0.005 + ei{x), 



where Oj and ei{x) follow the same error process in the true model, either the 
Gaussian or the ^3 distribution, and (/>(•) is the density function of a standard 
normal. Model (a) distorts the shape of the profile by adding a sine curve 
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Fig. 6. Simulated density profiles from the misspecified models (a) and (b). The dotted 
curve is the true reference profile, the solid black curves are simulated paths from the ts 
process, and the solid dark gray curves are those generated from the Gaussian process. 



with the coefficient A determining its amphtude, while model (b) introduces 
local "spiky" noise to the profile. The noise level is determined by the coef- 
ficient B. We consider the coefficients A = 0.75, 1 and 1.25 for model (a), 
and B = 0.02, 0.03 and 0.04 for model (b), representing smaller to larger 
contamination scales. Figure 6 displays the simulated profiles from the two 
misspecified models with the Gaussian and ts processes, respectively. As we 
can see in Figure 6, when the coefficients A and B increase, the severity 
of the noise also increases. We then apply the proposed screening proce- 
dure, with the resulting proportions of successful identification presented in 
Table 1. When the proffies are generated from the true model, the false dis- 
covery rates are 0.05 (±0.02) and 0.08 (±0.02), respectively. They are close 
to their nominal level 0.05. For misspecified models, the screening power 
increases with the amplitude of the noise. In both cases, we have decent 
power to detect moderate to larger profile deviations. When the error e(t) 
follows the ts distribution, the proffies fluctuate more, and, consequently, 
the screening power is lower than that of the Gaussian process. 
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Table 1 

The percent of the profiles that exceed the control limits. Under the true model, type I 

error is reported. Under models (a) and (b), the detection rate is reported 













Model (a) 






Model (b) 






True modelA 


= 0.75A= l.OOA 


= 1.25B 


= 0.02B = 0.03B 


= 0.04 


Gaussian e(i) 
ts distributed 


e{t) 


5% 
8% 


45% 
21% 


80% 
62% 


98% 
100% 


36% 

12% 


82% 
44% 


100% 
95% 



4. Discussion. This paper proposes to screen the shape of the profiles 
using a nonparametric location-scale L-1 model. The basic idea is to estimate 
a reference profile curve, based on which we can rank the differences of 
individual profiles from the reference profile. The nonparametric components 
of the proposed model provide sufficient flexibility to capture the shape of 
the profiles. In addition, as inherited from L-1 regression, we do not assume 
any specific distributions for the profiles, and the resulting estimates are 
fairly robust against the heavy-tail distributions and data contamination. 

L-1 versus L-2 screenings. The proposed screening procedure relies on 
L-1 regression. However, it remains valid if one replaces L-1 regression 
with L-2 regression (least-squares regression), which is more commonly used 
for control charts. Specifically, to achieve L-2 screening, we redefine ^i in 
model (1) as the conditional mean of the ith profile, and define ^(x) and s{x), 
respectively, as the conditional mean and standard deviation functions of 
Yi{x) — 5i. The centers 6i can be estimated by the sample mean of the 
ith. profile, and the functions fi{x) and s{x) can be estimated using ker- 
nel smoothing to replace the absolute value functions in (2) and (4) with 
square functions. Consequently, we define the deviation score di by the stan- 
dardized 6i, using its mean and standard deviation, and keep the deviation 
scores Ti and T2 in the same form, except that /x(x) and s{x) are now the 
conditional mean and standard deviation functions. The L-2 screening could 
be more efficient and effective for normative profiles, but the L-1 screening 
is known to be more robust if the Phase I profiles contain potential outliers. 
Practitioners can choose according to their specific needs. 

Profile monitoring is a relatively new area, but it is growing rapidly, as in- 
dicated by increasing numbers of practical applications. The proposed meth- 
ods are shown to be favorable when applied to the previously analyzed VDP 
data. They can be further generalized to reach a wider range of applications. 
First of all, we rank the shape deviation of an individual profile from the 
reference profile based on its largest residual and cumulative residuals. The 
asymptotic distributions of the resulting deviation scores are studied. De- 
pending on the application, alternative deviation scores can be used. For 
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example, instead of the largest residual, we can use the 90th percentile of 
the residuals. The properties of other deviation scores would be of research 
interest. Second, the proposed charts consist of three metrics, monitoring de- 
viations in location, shape and local disturbances, respectively. If one type of 
deviation is of major concern, one could focus on one metric (more targeted 
screening) to achieve better screening power. Third, in some applications, 
the data may exhibit specific features. Incorporating these special features 
into the estimation can further improve efficiency. For example, suppose that 
the VDP profile is assumed to be symmetric around the midpoint Cx- If we 
assume that the location function ^{x) is a smooth differentiable function, 
a more efficient way to estimate it is to regress yij over the transformed 
location x* . = \xij — Cx\, that is, estimate the conditional median func- 
tion inedian{yij\x* ■) = g{x* ,) with the constraint ^'(O) =0. Consequently, 
the location function ^{x) equals g{\x — Cx\)- Fourth, we assume in the pro- 
posed model that the errors are in a stationary random sequence. The devel- 
oped theories can be further generalized to incorporate more general error 
sequences. For example, Cij could be a linear combination of k station- 
ary error sequences. Fifth, the control limits are estimated from data, and 
hence subject to certain estimation errors [Jensen et al. (2006)]. When there 
are insufficient Phase I profiles, some corrections may be needed to ensure 
good properties of the proposed control charts. Finally, note that we use 
Phase I profiles to generate the reference distributions of the shape devi- 
ation scores. This approach implicitly assumes that there are a sufficient 
number of Phase I profiles that have been measured at similar locations as 
the profile to be screened. In the case of sparse profile data with irregular 
spacing, we may not have sufficient Phase I profiles to ensure a stable esti- 
mation of the screening thresholds. Further research is needed to deal with 
such sparse profile data. 

APPENDIX: CONDITIONS OF THEOREM 1 

Recall that, for a given i, {ei,j}i<j<mi is the error process of the zth 
profile. We assume that there exist an independent process {e^j^gz, such 
that the ith error profile {ei,j}i<j<mi can be viewed as one realization of 
the process {e£}i<i<rni- We assume that {e^j^gz has the representation 

(14) ei = G{ei,ei±i,ee±2,---), 

where £i,i G Z, are i.i.d. random vectors, and G is a measurable function 
such that ei is well defined. The representation (14) can be viewed as an 
input-output system with (e^ , ei±i , ££±2, ■ ■ ■), G and ei being the input, trans- 
form or filter, and output, respectively. This representation allows for non- 
causal models, and is hence particularly useful for our applications, which 
do not have a time structure. 
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For q> and a random variable e we denote by ||e||q = [E(|e|'^)]^''^ the Lg 
norm. We further assume the following: 

Condition 1 (Dependence condition). Let eo be as in (14), and (e^)^gz 
be an i.i.d. copy of {ee)e£Z- There exist q> and p G (0, 1) such that for all 

kGN, 

||eo-eo(fc)||g = 0(/) 
(15) 



where eo{k) = G(eo,e±i, • ■ • ,e±fc,e±(fc+i)>e±(fcH 



■2)' 



In (15), eo{k) can be viewed as a coupling process of cq with £r being 
coupled by the i.i.d. copy e[. for \r\ > {k + 1), while keeping the nearest 
2/c + 1 innovations e^ with \r\ < k. In particular, if cq does not depend 
on (er)|r|>(fc+i)i then eo{k) = cq. Thus, ||eo — eo(A;)||g can be viewed as the 
impact of {£r)\r\>{k+i) oil Co- Intuitively, it is reasonable to expect that mea- 
surements sufficiently far away would have negligible impact. In particular, 
condition (15) states that the impact decays exponentially as the location 
space k increases. As shown in Lemma 1 of Wei, Zhao and Lin (2011), (15) 
implies that the correlation between cq and eg decays exponentially as i 
increases. 

Condition 2 (Location condition). The set of measurement locations 
{^i,j ) 1 ^ J < "T-ji 1 ^ ^ ^ ^} is asymptotically uniformly dense in [a, b] . Specif- 
ically, let a = xq < xi < • • • < xn„ < xm^+i = h he the ordered locations, 

where N^ = mi -\ \- ?7t,.„ is the total number of measurements. We assume 

that maxo<fc<Ar„|xfc+i - Xfc - ^| = 0{N~'^). 

Condition 3 (Kernel condition). Let K,u,-,u) > 0, be the set of kernels 
which are bounded, symmetric, and have bounded support [— w, w] with boun- 
ded derivative. Let fx = f^K'^{'^)du and tpx = f^u^ K (u) du/2, K gJC^}- 
The kernel K E /C^ . 

Condition 4 (Smoothness condition). Denote by Fg and fe = F^,, re- 
spectively, the distribution and density functions of cq in (14). Assume /i, 
sGC4([G,6]),inf,e[„,b]s(x)>0,/eGC4(M),/e(0)>0,/e(l) + /e(-l)>0. He- 
re C^{S) is the set of 4th order continuously differentiable functions on set S. 

Condition 2 says that the pooled locations of measurements should be 
reasonably uniformly dense, even though each single profile may only con- 
tain sparse measurements. The remaining two conditions are also fairly gen- 
eral, and commonly assumed in kernel smoothing. Under Conditions 1-4 
and S„ — )• (see Theorem 1 for definition) , the location and scale function 
estimates, flb^ and s^^, are uniformly consistent. 



PROFILE CHARTS WITH NONPARAMETRIC L-1 REGRESSION 19 

SUPPLEMENTARY MATERIAL 

Proof of Theorem 1 (DOL 10.1214/11-AOAS501SUPP; .pdf). The tech- 
nical proof of Theorem 1 is provided in the supplementary material. 
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